SCAN - A utility to scan binary files for text strings.
USAGE: SCAN +<on> -<off> {infile} {outfile}
where:
    +<on>      List of switches to activate.
    -<off>     List of switches to deactivate.
    {infile}   Name of input file.
    {outfile}  Name of output file.
               You may use / or \ in pathnames if they are
               supplied as parameters (as opposed to
               command-line redirection).
The minimum length a string must have to be printed can be changed
by including a digit (1-9) in any of the - or + switch sets. If
several digits are given, the last one is used.
SCAN ? will present a brief list of options.
Arguments may be in any order. To give an output file, you must
give an input file. If no file names are given, SCAN uses
standard input & standard output so you may also use command-line
redirection and pipes.
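
The argument rules above can be sketched in Python as follows. This is
only an illustration, not SCAN's own source: the name parse_args and the
shape of its return value are hypothetical, and treating switch letters
as case-insensitive is an assumption. The sketch does not handle the
"SCAN ?" help request.

```python
def parse_args(argv):
    """Split SCAN-style arguments into switch changes, a minimum length,
    and file names. Hypothetical helper, not taken from SCAN itself."""
    changes = []        # (switch, turn_on) pairs, in command-line order
    min_length = None   # None means "keep the default minimum length"
    files = []
    for arg in argv:
        if arg.startswith(("+", "-")):
            turn_on = arg[0] == "+"
            for ch in arg[1:]:
                if ch in "123456789":
                    # A digit means the same under + or -; the last one wins.
                    min_length = int(ch)
                else:
                    # Assumption: switch letters are case-insensitive.
                    changes.append((ch.upper(), turn_on))
        else:
            files.append(arg)
    # Only the first two names are used; extra names are ignored.
    infile = files[0] if len(files) > 0 else None    # None: standard input
    outfile = files[1] if len(files) > 1 else None   # None: standard output
    return changes, min_length, infile, outfile
```

Keeping the switch changes as one ordered list, rather than separate
"on" and "off" sets, preserves the order in which they were typed.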
WHAT IT DOES: SCAN reads a binary file and prints any text
strings it finds. A text string is defined (by default) as:
1) All characters in the printable ASCII range--SPACE through
~. The tab character (^I) is also considered printable.
2) At least four characters long. May be overridden with the
1-9 character length switches.
Any strings found are printed to standard output or the specified
output file, one per line.
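
The default definition above can be sketched in Python as follows. This
is a minimal illustration under stated assumptions, not SCAN's actual
code: the name find_strings is hypothetical, and the sketch ignores
every switch except the minimum length (no space stripping and no
output line limit).

```python
def find_strings(data, min_length=4):
    """Return the printable runs in `data` (bytes) that are at least
    `min_length` characters long. Hypothetical helper for illustration."""
    results = []
    run = bytearray()
    for byte in data:
        if 0x20 <= byte <= 0x7E or byte == 0x09:   # SPACE through ~, plus TAB
            run.append(byte)
        else:
            # A non-printable byte ends the current run.
            if len(run) >= min_length:
                results.append(run.decode("ascii"))
            run.clear()
    if len(run) >= min_length:   # flush a run that reaches end of input
        results.append(run.decode("ascii"))
    return results
```

For example, find_strings(open("foo.exe", "rb").read()) would list the
candidate strings of a hypothetical file foo.exe, which could then be
printed one per line.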
OPTIONS: A number of options modify SCAN's filtering:
      H    Treat character values 128-255 as printable.
      ?    Treat foreign (high bit) characters as printable.
      G    Treat graphics characters as printable (the set from
           ASCII 176 to 223 decimal).
      #    Print characters above ASCII 127 as '\<ASCII>'.
      .    Print characters above ASCII 127 as '.'.
      %    Strip high bits from input.
   *  {    Strip leading spaces from strings.
      }    Strip trailing spaces from strings.
   *  S    Maximum output string length is (+)72
           or (-)255 characters.
      U    Translate output to upper case.
      B    Display space as '_'.
      E    String must contain at least one consonant
           and one vowel (a,e,i,o,u,y) to be printed.
      $    Display ESCape as '\$'. ESC becomes printable.
      C    Display CR as '\C'. CR becomes printable.
      L    Display LF as '\L'. LF becomes printable.
      F    Display FF as '\F'. FF becomes printable.
   *  T    Make TAB a printable character.
      @    Display tabs as '\T' (implies T).
      \    Display \ as '\\'. Useful with $, @, C, L, F.
      0    Display NULLs (as '\0'). A null still ends a string.
      !    String must end in NULL to be displayed (does NOT
           imply 0--Set 0 if you want to SEE nulls).
      1-9  Minimum length for a printable string is set to
           the value given (treated identically whether - or +).
* means the option is turned ON by default. Otherwise
it is turned OFF.
"Printable" means a character is included in a text string.
"Non-printable" means a character isn't included, and ENDS
a text string if encountered.
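
As one illustration of these filters, the test behind option E can be
sketched in Python. This is a guess at the behaviour, not SCAN's own
code: the name passes_e_filter is hypothetical, and the sketch assumes
that only the ASCII letters A-Z count, that case is ignored, and that
y counts only as a vowel.

```python
VOWELS = set("aeiouy")   # the vowel list given for option E

def passes_e_filter(text):
    """Return True if `text` has at least one vowel and one consonant.
    Hypothetical helper; assumes ASCII letters only, case ignored."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    has_vowel = any(ch in VOWELS for ch in letters)
    has_consonant = any(ch not in VOWELS for ch in letters)
    return has_vowel and has_consonant
```

Under these assumptions, a run of digits or punctuation is rejected,
and so is a string made up entirely of vowels.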
NOTES
* Turning on @, $, C, L, F, or 0 turns on \ (you may override
this by explicitly turning off \ later in the
command line).
* Turning on H turns off ? and G. Turning on ? or G turns
off H.
* # has precedence over '.'. If both are turned off, hi-bit
characters are printed directly (note that you must turn on H,
?, or G to get any hi-bit characters).
* Setting % turns off H, ?, and G.
* Option U also upper-cases the 6 or so international characters
which have both upper- and lower-case versions.
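
The switch interactions in these notes can be sketched in Python as a
function that applies changes in command-line order. This is an
illustration only: the name resolve_switches and the (switch, turn_on)
pair format are hypothetical, the default set holds only the options
starred in the list above, and how SCAN itself resolves conflicting
switches beyond the stated rules is not known.

```python
DEFAULT_ON = {"{", "S", "T"}     # only the options starred as ON by default
ESCAPE_SWITCHES = set("@$CLF0")  # turning any of these on also turns on \
HIGH_BIT = {"H", "?", "G"}

def resolve_switches(changes):
    """Apply (switch, turn_on) pairs in order and return the set of
    switches left on. Hypothetical helper for illustration."""
    active = set(DEFAULT_ON)
    for switch, turn_on in changes:
        if not turn_on:
            active.discard(switch)   # an explicit "off" always wins at this point
            continue
        active.add(switch)
        if switch in ESCAPE_SWITCHES:
            active.add("\\")
        if switch == "@":
            active.add("T")          # @ implies T
        if switch == "H":
            active -= {"?", "G"}
        if switch in ("?", "G"):
            active.discard("H")
        if switch == "%":
            active -= HIGH_BIT
    return active
```

Because the changes are applied in order, turning off \ after one of
the escape switches leaves \ off, as the first note describes.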
EXAMPLES:
SCAN foo.doc -{} +%       -- Read FOO.DOC, leaving leading & trailing
                             spaces, and stripping high bits in case
                             it's a WordStar file.
SCAN +!e -t nethack.exe   -- Find all the spoilers in NETHACK :-)
BUGS:
Don't run SCAN from the keyboard without an input file or redirected
input--SCAN opens stdin as a binary file and ignores control-Z, so
you'll have to hit ctrl-C to break out of the program.
Extra file names on the command line are totally ignored.
Wildcards in the input file name are not supported.
AUTHOR: Kenneth J. Herron
111 Buchanan Street
Lexington, Kentucky 40508
Like the program says, SCAN is placed in the public domain.
HISTORY:
1.0: (12/??/87) First working version.
1.5: (12/23/87) First released version.
1.6: (3/22/88) Added help screen. Made some minor changes to
improve execution speed about 8-10%. Fixed bug in which
only the first 255 characters of a too-long string were
saved. Fixed bug in which the minimum length was set to 3,
rather than the declared default minimum, if a / switch was
given with an unintelligible argument.
2.0: (5/1/88) Removed the 'strip high bits on output' option--
couldn't think of any use for it.
Made the digits 1-9 into switches for minimum string
length. Previously, minimum length was set with a special
option flagged with /. This change allows using / in file
pathnames.
Changed the logic of the NULL-related switches and the
high-bit-related switches. Now you can select only null-
terminated strings without having to see the nulls.
Changed the characters used for some options, mostly
to make them mnemonic.
Added the short/long output lines option so long runs of
text don't break up so weird (now they break up a different
weird). Output is short by default.
Added the graphics set switch and the two hi-bit conversion
switches (convert-to-dot and convert-to-ASCII).
2.1: (7/5/88) Following a USENET posting from Torsten Olsson of
Sweden, implemented an UPCASE function which handles the
PC's international characters. This function is written as
a unit and may be incorporated into any program
transparently.
Removed the Greek characters from the foreign character
set and added the quote marks #174 & #175.